2024-08-20
Figure 1. Book cover
Figure 2. Photo of John Chambers
Figure 3. Photo of Richard Becker
Figure 4. Photo of Allan Wilks
Figure 5. Aerial photograph of Bell Laboratories
Figure 6. Venables and Ripley book cover
Figure 7. Excerpt from research paper
Figure 8. CD of release 1.0 of R
Figure 9. Excerpt from New York Times article
Figure 10. Excerpt from website
Figure 11. Excerpt from article
Figure 12. Excerpt from website
Figure 13. Excerpt from website
Figure 14. Excerpt from website
Figure 15. Excerpt from website
Figure 16. Excerpt from website
Figure 17. Excerpt from website
Figure 18. Title slide from Frank Harrell talk
Figure 19. Title slide from presentation
Figure 20. Hex sticker for tidyverse
Figure 21. Hex sticker for dplyr
Figure 22. Hex sticker for ggplot2
Figure 23. Hex sticker for magrittr
Figure 24. Hex sticker for readr
Figure 25. Hex sticker for stringr
Figure 26. Hex sticker for tibble
Figure 27. Hex sticker for tidyr
Figure 28. Excerpt from GitHub site
Figure 29. Hex sticker for knitr
Figure 30. Hex sticker for bookdown
Figure 31. Excerpt from Posit blog
Figure 32. Hex sticker for Quarto
Figure 33. Excerpt from README file
Figure 34. Excerpt from blog post
Figure 35. Title slide from presentation
Figure 36. Excerpt from blog post
Figure 37. Excerpt from research paper
Figure 38. Excerpt from paper
Figure 39. Excerpt from website
All infants born in the state of Missouri during the 1995 calendar year who have one or more visits to the Emergency room during their first year of life.
Type your ideas in the chat box.
A research paper computes a p-value of 0.45. How would you interpret this p-value?
Figure 1: xkcd cartoon about jelly beans and cancer
Figure 2: Interval that contains the null value
Figure 3: Interval entirely above the null value
Figure 4: Interval entirely below the null value
Figure 5: Interval entirely inside the range of clinical indifference
Figure 6: Interval partly inside/outside range of clinical indifference
A research paper computes a confidence interval for a relative risk of 0.82 to 3.94. This confidence interval tells you that the result is
Figure 7: Confidence interval entirely inside the range of clinical indifference
Figure 8: Confidence interval entirely outside the range of clinical indifference
This file was written by Steve Simon on 2023-08-15 with the last major revision on 2024-08-20. It is in the public domain and you can use it any way you please.
data_dictionary: albuquerque-housing
format:
  txt: tab-delimited
  csv: comma-delimited
  sas7bdat: proprietary SAS
  sav: proprietary SPSS
varnames: first row of data
missing_value_code: '.'
description: |
  From the original source (no longer available): A random sample of records of resales of homes from Feb 15 to Apr 30, 1993 from the files maintained by the Albuquerque Board of Realtors. This type of data is collected by multiple listing agencies in many cities and is used by realtors as an information base.
download_url: https://raw.githubusercontent.com/pmean/datasets/master/albuquerque-housing.csv
source: |
  DASL (Data and Story Library), a repository for various data sets useful for teaching. This file was lost in the transition of DASL from statlib to datadescription.
copyright: |
  Unknown. You should be able to use this data for individual educational purposes under the Fair Use guidelines of U.S. copyright law.
size:
  rows: 117
  columns: 7
price:
  label: Sales price of house
  scale: ratio
  unit: dollars
sqft:
  label: Square footage of house
  scale: ratio
  unit: square feet
age:
  label: Age of house
  scale: ratio
  unit: years
features:
  label: Number of features of house
  scale: ratio
  range: 0 to 13
northeast:
  label: Is house located in Northeast Albuquerque?
  scale: nominal
  value: yes/no
custom_build:
  label: Is the house custom built?
  scale: nominal
  value: yes/no
corner_lot:
  label: Is the house on a corner lot?
  scale: nominal
  value: yes/no
---
title: "Template for 5501-01 programming assignment"
author: "Steve Simon"
format:
  html:
    embed-resources: true
date: 2024-08-18
---
This program reads data on housing prices in Albuquerque, New Mexico in 1993. Find more information in the [data dictionary][dd].
[dd]: https://github.com/pmean/datasets/blob/master/albuquerque-housing.yaml
This code is placed in the public domain.
## Load the tidyverse library
Most of your programs should start by loading the tidyverse library. The chunk options in this block suppress its startup messages and warnings.
```{r setup}
#| message: false
#| warning: false
library(tidyverse)
```
## Read the data and view a brief summary
Use the read_csv function to read the data. The glimpse function will produce a brief summary.
```{r read}
alb <- read_csv(
  file="../data/albuquerque-housing.csv",
  col_names=TRUE,
  col_types="nnnnccc",
  na=".")
glimpse(alb)
```
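To see what the col_types and na arguments do, here is a minimal sketch that reads a few made-up rows from a literal string instead of the real file. The values and the name toy_alb are invented for illustration; only the column names and the argument settings come from the code above. Each "n" requests a numeric column, each "c" a character column, and a lone "." is read as missing.

```r
library(readr)

# Made-up rows for illustration only; these are not values from the real file.
toy_csv <- I(
"price,sqft,age,features,northeast,custom_build,corner_lot
100000,1500,10,4,yes,no,no
120000,1800,.,5,no,yes,no
")

# Same arguments as the real read: four numeric columns, three character columns.
toy_alb <- read_csv(
  file=toy_csv,
  col_names=TRUE,
  col_types="nnnnccc",
  na=".")

# The "." in the age column of the second row becomes NA.
toy_alb
```

Wrapping the string in I() tells read_csv to treat it as literal data rather than a file path.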
## Calculate overall means
The summarize_if function applies the mean function, but only to the columns that pass the is.numeric test. You wouldn't want to compute means for columns with values "yes" and "no".
```{r means}
alb |>
  summarize_if(is.numeric, mean, na.rm = TRUE)
```
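Recent dplyr documentation marks the scoped verbs (those ending in _if, _at, and _all) as superseded in favor of across. The sketch below shows one possible equivalent, run on a small made-up tibble so that it stands alone. The names toy and toy_means and all the values are invented for illustration; this is an alternative spelling, not a change to the template's approach.

```r
library(dplyr)

# Made-up values for illustration only; the real data are in alb.
toy <- tibble(
  price=c(100000, 120000, NA),
  sqft=c(1500, 1800, 2100),
  northeast=c("yes", "no", "yes"))

# Compute the mean of every numeric column, ignoring missing values.
# The character column northeast is dropped from the result.
toy_means <- toy |>
  summarize(across(where(is.numeric), \(x) mean(x, na.rm = TRUE)))
toy_means
```

Replacing toy with alb should give the same means as the summarize_if version.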
## Summarize price
The average price of a home, 106 thousand dollars, is quite low by current standards because the data come from 1993.
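A statement like this is easier to check when the mean is reported alongside the spread and the range. The sketch below is one way to do that for a single variable. It uses a made-up tibble named toy_price so that it stands alone; the values are invented and are not from the Albuquerque data. In practice you would replace toy_price with alb.

```r
library(dplyr)

# Made-up prices for illustration only; replace toy_price with alb in practice.
toy_price <- tibble(price=c(90000, 100000, 110000, NA))

# Mean, standard deviation, minimum, and maximum, ignoring missing values.
price_summary <- toy_price |>
  summarize(
    price_mean=mean(price, na.rm = TRUE),
    price_sd=sd(price, na.rm = TRUE),
    price_min=min(price, na.rm = TRUE),
    price_max=max(price, na.rm = TRUE))
price_summary
```

The same pattern should carry over to the other numeric variables by swapping the variable name.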
## Summarize sqft
## Summarize age
## Summarize features